Learning via Gaussian Herding
Authors
Abstract
We introduce a new family of online learning algorithms based upon constraining the velocity flow over a distribution of weight vectors. In particular, we show how to effectively herd a Gaussian weight vector distribution by trading off velocity constraints with a loss function. By uniformly bounding this loss function, we demonstrate how to solve the resulting optimization analytically. We compare the resulting algorithms on a variety of real world datasets, and demonstrate how these algorithms achieve state-of-the-art robust performance, especially with high label noise in the training data.
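To make the setting concrete: algorithms in this family maintain a Gaussian distribution N(μ, Σ) over weight vectors and update both the mean and the covariance on each example. The sketch below implements AROW (Crammer, Kulesza, Dredze), a closely related second-order online learner, purely for illustration; it is not the paper's Gaussian-herding (NHERD) update. The regularization constant `r` and the toy 2-D dataset are assumptions for the demo.

```python
import numpy as np

def arow_train(X, y, r=1.0):
    """One pass of AROW: a second-order online update over a Gaussian
    N(mu, Sigma) of weight vectors. NOTE: illustrative stand-in, not the
    paper's NHERD rule; r is an assumed regularization constant."""
    d = X.shape[1]
    mu = np.zeros(d)        # mean weight vector
    Sigma = np.eye(d)       # covariance: per-direction confidence
    for x, label in zip(X, y):
        m = mu @ x                      # signed margin under the mean
        v = x @ Sigma @ x               # variance of the score along x
        loss = max(0.0, 1.0 - label * m)  # hinge loss
        if loss > 0:
            beta = 1.0 / (v + r)
            alpha = loss * beta
            Sigma_x = Sigma @ x
            mu = mu + alpha * label * Sigma_x          # move the mean
            Sigma = Sigma - beta * np.outer(Sigma_x, Sigma_x)  # shrink variance along x
    return mu, Sigma

# Assumed toy data: two classes separated along the first coordinate.
X = np.array([[1.0, 0.2], [0.9, -0.1], [-1.0, 0.3], [-0.8, -0.2]])
y = np.array([1, 1, -1, -1])
mu, Sigma = arow_train(X, y)
```

The covariance shrinks fastest in directions that are observed often, so later updates become more conservative there; this confidence weighting is what gives this family its robustness to label noise.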
Similar resources
Compact Convex Projections
We study the usefulness of conditional gradient like methods for determining projections onto convex sets, in particular, projections onto naturally arising convex sets in reproducing kernel Hilbert spaces. Our work is motivated by the recently introduced kernel herding algorithm which is closely related to the Conditional Gradient Method (CGM). It is known that the herding algorithm converges ...
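A conditional-gradient projection of the kind this snippet describes can be sketched in a few lines: each iteration replaces the quadratic projection with a linear minimization over the constraint set's vertices. This is a generic Frank–Wolfe projection onto a polytope, not the paper's kernel-Hilbert-space construction; the unit-square example is an assumption for the demo.

```python
import numpy as np

def fw_project(z, vertices, iters=200):
    """Project z onto the convex hull of `vertices` via the Conditional
    Gradient Method (Frank-Wolfe) with exact line search for the
    quadratic objective 0.5 * ||x - z||^2."""
    x = vertices[0].astype(float).copy()
    for _ in range(iters):
        g = x - z                               # gradient of the objective
        v = vertices[np.argmin(vertices @ g)]   # linear minimization oracle
        d = v - x
        gap = -(g @ d)                          # Frank-Wolfe duality gap
        if gap < 1e-9:                          # optimality certificate
            break
        gamma = np.clip(gap / (d @ d), 0.0, 1.0)  # exact line search step
        x = x + gamma * d
    return x

# Assumed toy example: project a point outside the unit square onto it.
square = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
proj = fw_project(np.array([2.0, 0.5]), square)  # true projection is [1, 0.5]
```

Each iterate stays a convex combination of vertices, which is why such methods pair naturally with herding-style moment matching.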
The speed of sequential asymptotic learning
In the classical herding literature, agents receive a private signal regarding a binary state of nature, and sequentially choose an action, after observing the actions of their predecessors. When the informativeness of private signals is unbounded, it is known that agents converge to the correct action and correct belief. We study how quickly convergence occurs, and show that it happens more sl...
Herding: Driving Deterministic Dynamics to Learn . . .
Abstract of the dissertation "Herding: Driving Deterministic Dynamics to Learn and Sample Probabilistic Models" by Yutian Chen, Doctor of Philosophy in Computer Science, University of California, Irvine, 2013 (Professor Max Welling, Chair). The herding algorithm was recently proposed as a deterministic algorithm to learn Markov random fields (MRFs). Instead of obtaining a fixed set of model parameters, herding...
On Herding in Deep Networks
Maximum likelihood learning in Markov Random Fields (MRFs) with multiple layers of hidden units is typically performed using contrastive divergence or one of its variants. After learning, samples from the model are generally used to estimate expectations under the model distribution. Recently, Welling proposed a new approach to working with MRFs with a single layer of hidden units. The approach...
Herding as a Learning System with Edge-of-Chaos Dynamics
Herding defines a deterministic dynamical system at the edge of chaos. It generates a sequence of model states and parameters by alternating parameter perturbations with state maximizations, where the sequence of states can be interpreted as “samples” from an associated MRF model. Herding differs from maximum likelihood estimation in that the sequence of parameters does not converge to a fixed ...
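The alternating "perturb parameters, maximize over states" dynamics described above can be sketched in its simplest form: matching the moments of a discrete distribution with one-hot features. The target probabilities below are an assumption for the demo; the point is that the deterministic sample stream matches them at a fast O(1/T) rate, with no randomness anywhere.

```python
import numpy as np

# Assumed target distribution over 3 discrete states (one-hot features).
p = np.array([0.2, 0.3, 0.5])
w = p.copy()                    # herding weights, initialized at the target moments
counts = np.zeros_like(p)
T = 1000
for _ in range(T):
    s = int(np.argmax(w))       # state maximization: pick the most "owed" state
    counts[s] += 1
    w += p                      # parameter perturbation toward the target moments...
    w[s] -= 1.0                 # ...minus the emitted state's one-hot feature

freqs = counts / T              # empirical frequencies of the deterministic samples
```

Because the weight vector stays bounded, the empirical frequencies deviate from `p` by at most O(1/T), beating the O(1/sqrt(T)) rate of i.i.d. sampling; the weights never converge, which is the edge-of-chaos behavior the abstract refers to.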
Publication date: 2010